Autotuning divide-and-conquer stencil computations
نویسندگان
چکیده
This paper explores autotuning strategies for serial divide-and-conquer stencil computations, comparing the efficacy of traditional “heuristic” autotuning with that of “pruned-exhaustive” autotuning. We present a pruned-exhaustive autotuner called Ztune that searches for optimal divide-and-conquer trees for stencil computations. Ztune uses three pruning properties — space-time equivalence, divide subsumption, and favored dimension — that greatly reduce the size of the search domain without significantly sacrificing the quality of the autotuned code. We compared the performance of Ztune with that of a state-of-the-art heuristic autotuner called OpenTuner in tuning the divide-and-conquer algorithm used in Pochoir stencil compiler. Over a nightly run on ten application benchmarks across two machines with different hardware configurations, the Ztuned code ran 5%–12% faster on average, and the OpenTuner tuned code ran from 9% slower to 2% faster on average, than Pochoir’s default code. In the best case, the Ztuned code ran 40% faster, and the OpenTuner tuned code ran 33% faster than Pochoir’s code. Whereas the autotuning time of Ztune for each benchmark could be measured in minutes, to achieve comparable results, the autotuning time of OpenTuner was typically measured in hours or days. Surprisingly, for some benchmarks, Ztune actually autotuned faster than the time it takes to perform the stencil computation once. Copyright © 0000 John Wiley & Sons, Ltd.
منابع مشابه
Autotuning Divide-and-Conquer Matrix-Vector Multiplication
Divide and conquer is an important concept in computer science. It is used ubiquitously to simplify and speed up programs. However, it needs to be optimized, with respect to parameter settings for example, in order to achieve the best performance. The problem boils down to searching for the best implementation choice on a given set of requirements, such as which machine the program is running o...
متن کاملFree Vibration Analysis of Repetitive Structures using Decomposition, and Divide-Conquer Methods
This paper consists of three sections. In the first section an efficient method is used for decomposition of the canonical matrices associated with repetitive structures. to this end, cylindrical coordinate system, as well as a special numbering scheme were employed. In the second section, divide and conquer method have been used for eigensolution of these structures, where the matrices are in ...
متن کاملHierarchical Computation in the SPMD Programming Model
Large-scale parallel machines are programmed mainly with the single program, multiple data (SPMD) model of parallelism. While this model has advantages of scalability and simplicity, it does not fit well with divide-and-conquer parallelism or hierarchical machines that mix shared and distributed memory. In this paper, we define the recursive single program, multiple data model (RSPMD) that exte...
متن کاملAn E cient Graph Coloring Algorithm for Stencil-based Jacobian Computations
Numerical methods for the solution of partial di↵erential equations constitute an important class of techniques in scientific computing. Often, the discretization is based on approximating the partial derivatives by finite di↵erences on a regular Cartesian grid. The resulting computations are structured in the sense of updating a large, multidimensional array by a stencil operation. A stencil d...
متن کاملStencil-Aware GPU Optimization of Iterative Solvers
Numerical solutions of nonlinear partial differential equations frequently rely on iterative Newton-Krylov methods, which linearize a finite-difference stencil-based discretization of a problem, producing a sparse matrix with regular structure. Knowledge of this structure can be used to exploit parallelism and locality of reference on modern cache-based multiand manycore architectures, achievin...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
- Concurrency and Computation: Practice and Experience
دوره 29 شماره
صفحات -
تاریخ انتشار 2017